Compressing Relations and Indexes

Authors

Jonathan Goldstein

Raghu Ramakrishnan

Uri Shaft

Abstract

We propose a new compression algorithm that is tailored to database applications. It can be applied to a collection of records, and is especially effective for records with many low to medium cardinality fields and numeric fields. In addition, this new technique supports very fast decompression. Promising application domains include decision support systems (DSS), since fact tables, which are by far the largest tables in these applications, contain many low and medium cardinality fields and typically no text fields. Further, our decompression rates are faster than typical disk throughputs for sequential scans; in contrast, gzip is slower. This is important in DSS applications, which often scan large ranges of records. An important distinguishing characteristic of our algorithm, in contrast to compression algorithms proposed earlier, is that we can decompress individual tuples (even individual fields), rather than a full page (or an entire relation) at a time. Also, all the information needed for tuple decompression resides on the same page with the tuple. This means that a page can be stored in the buffer pool and used in compressed form, simplifying the job of the buffer manager and improving memory utilization. Our compression algorithm also improves index structures such as B-trees and R-trees significantly by reducing the number of leaf pages and compressing index entries, which greatly increases the fan-out. We can also use lossy compression on the internal nodes of an index.
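The abstract states the properties of the algorithm but not its encoding. As a minimal sketch of how tuple-level and field-level decompression with page-local metadata might look, the example below assumes a frame-of-reference style scheme: each page stores the per-field minimum, and every value is kept as a fixed-width bit offset from that minimum. The scheme and the names `CompressedPage`, `compress_page`, `read_field`, and `read_row` are illustrative assumptions, not the authors' published method.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class CompressedPage:
    """Hypothetical page layout: everything needed to decode lives here."""
    mins: List[int]     # per-field minimum on this page (the "frame")
    widths: List[int]   # bits used to store each field's offset
    n_rows: int
    bits: int           # all rows packed into one int, row 0 in the low bits

    @property
    def row_width(self) -> int:
        return sum(self.widths)


def compress_page(rows: Sequence[Tuple[int, ...]]) -> CompressedPage:
    """Pack integer rows as offsets from the per-field page minimum."""
    if not rows:
        raise ValueError("a page needs at least one row")
    columns = list(zip(*rows))
    mins = [min(col) for col in columns]
    # A field whose values are all equal on this page needs zero bits.
    widths = [(max(col) - lo).bit_length() for col, lo in zip(columns, mins)]
    row_width = sum(widths)
    bits = 0
    for i, row in enumerate(rows):
        packed, shift = 0, 0
        for value, lo, width in zip(row, mins, widths):
            packed |= (value - lo) << shift
            shift += width
        bits |= packed << (i * row_width)
    return CompressedPage(mins, widths, len(rows), bits)


def read_field(page: CompressedPage, row: int, field: int) -> int:
    """Decode one field of one row without touching any other value."""
    if not 0 <= row < page.n_rows:
        raise IndexError("row out of range")
    shift = row * page.row_width + sum(page.widths[:field])
    mask = (1 << page.widths[field]) - 1
    return page.mins[field] + ((page.bits >> shift) & mask)


def read_row(page: CompressedPage, row: int) -> Tuple[int, ...]:
    """Decode a single tuple using only this page's metadata."""
    return tuple(read_field(page, row, f) for f in range(len(page.mins)))


# Example fact-table rows: (year, store_id, amount), all low-cardinality or numeric.
rows = [(1998, 3, 100), (1998, 7, 250), (1999, 3, 100)]
page = compress_page(rows)
```

Because the offsets are fixed-width within a page, the position of any field is a simple arithmetic function of the row and field numbers. That is what would let a page stay compressed in the buffer pool while individual fields are read on demand, under the assumptions of this sketch.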


Similar resources

Indexing Variation Graphs

Variation graphs, which represent genetic variation within a population, are replacing sequences as reference genomes. Path indexes are one of the most important tools for working with variation graphs. They generalize text indexes to graphs, allowing one to find the paths matching the query string. We propose using de Bruijn graphs as path indexes, compressing them by merging redundant subgrap...


Squeezing the Most out of Relational Database Systems

With the increasing speed of CPUs relative to disks, using compression as a means of improving disk information throughput is becoming very attractive. Traditional compression algorithms such as Lempel-Ziv, which is the basis of the standard gzip compression package, are inadequate for compressing relations in a relational database system. This inadequacy is derived from two problems. The first...


MRCSI: Compressing and Searching String Collections with Multiple References

Efficiently storing and searching collections of similar strings, such as large populations of genomes or long change histories of documents from Wikis, is a timely and challenging problem. Several recent proposals could drastically reduce space requirements by exploiting the similarity between strings in so-called reference-based compression. However, these indexes are usually not searchable an...


Efficient Index Compression in DB2 LUW

In database systems, the costs of data storage and retrieval are important components of the total cost and response time of the system. A popular mechanism to reduce the storage footprint is compressing the data residing in tables and indexes. Compressing indexes efficiently, while maintaining response time requirements, is known to be challenging. This is especially true when designing for ...


Quantitative Assessment of the Impact of Drought on Barley Yield in East Azerbaijan Using Multivariate Regression

Growing-season climatic parameters, especially rainfall, play the main role in predicting yield. Therefore, the main objective of this research was to find possible relations between meteorological parameters and drought indexes and the yield, using classical statistical methods. To achieve this objective, ten meteorological parameters and twelve drought indexes were evaluate...



Publication date: 1998
